Mining E-mail Authorship
Abstract
In this paper we report an investigation into learning authorship identification, or categorisation, for e-mail documents. We use various e-mail document features, such as structural characteristics and linguistic evidence, together with the Support Vector Machine as the learning algorithm. Experiments on a number of e-mail documents give promising results, with some e-mail document features and author categories yielding better categorisation performance than others.
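To make the idea of "structural characteristics and linguistic evidence" concrete, the following is a minimal sketch of how one e-mail body might be turned into a numeric feature vector. The abstract names only the broad feature categories, so every specific feature here (the greeting check, the function-word list, the ratios) and the name `extract_features` are illustrative assumptions, not the feature set used in the paper.

```python
import re
import string

# Hypothetical feature set: the abstract names only the broad categories
# (structural characteristics, linguistic evidence), not these exact features.
FUNCTION_WORDS = ("the", "and", "of", "to", "in", "that", "i", "you")


def extract_features(body: str) -> dict:
    """Map one e-mail body to a small dictionary of numeric style features."""
    lines = body.splitlines()
    words = re.findall(r"[A-Za-z']+", body.lower())
    n_lines = max(len(lines), 1)  # guards against division by zero
    n_words = max(len(words), 1)
    n_chars = max(len(body), 1)

    features = {
        # Structural characteristics (layout of the message)
        "n_lines": len(lines),
        "blank_line_ratio": sum(1 for ln in lines if not ln.strip()) / n_lines,
        "has_greeting": float(bool(re.match(r"\s*(hi|hello|dear)\b", body, re.I))),
        # Linguistic evidence (style markers largely independent of topic)
        "avg_word_len": sum(len(w) for w in words) / n_words,
        "punct_ratio": sum(ch in string.punctuation for ch in body) / n_chars,
        "upper_ratio": sum(ch.isupper() for ch in body) / n_chars,
    }
    # Relative frequency of each function word in the message
    for fw in FUNCTION_WORDS:
        features[f"fw_{fw}"] = words.count(fw) / n_words
    return features


if __name__ == "__main__":
    sample = "Hi Bob,\n\nThe report is in the folder.\n"
    for name, value in extract_features(sample).items():
        print(f"{name}: {value:.3f}")
```

Vectors of this kind, one per message and labelled with the known author, could then be used to train a Support Vector Machine classifier, for example scikit-learn's `SVC`. That choice of library is likewise an assumption; the abstract does not say which implementation was used.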
Similar resources
A Novel Approach of Mining Write-Prints for Authorship Attribution in E-mail Forensics
There is an alarming increase in the number of cybercrime incidents through anonymous e-mails. The problem of e-mail authorship attribution is to identify the most plausible author of an anonymous e-mail from a group of potential suspects. Most previous contributions employed a traditional classification approach, such as decision tree and Support Vector Machine (SVM), to identify the author an...
Gender-Preferential Text Mining of E-mail Discourse
This paper describes an investigation of authorship gender attribution mining from e-mail text documents. We used an extended set of predominantly topic content-free e-mail document features such as style markers, structural characteristics and gender-preferential language features together with a Support Vector Machine learning algorithm. Experiments using a corpus of e-mail documents generate...
E-mail authorship attribution using customized associative classification
E-mail communication is often abused for conducting social engineering attacks including spamming, phishing, identity theft and for distributing malware. This is largely attributed to the problem of anonymity inherent in the standard electronic mail protocol. In the literature, authorship attribution is studied as a text categorization problem where the writing styles of individuals are modeled...
Language and Gender Author Cohort Analysis of E-mail for Computer Forensics
We describe an investigation of authorship gender and language background cohort attribution mining from e-mail text documents. We used an extended set of predominantly topic content-free e-mail document features such as style markers, structural characteristics and gender-preferential language features together with a Support Vector Machine learning algorithm. Experiments using a corpus of e-m...
Visualizing IKAT Co-Authorship Networks by Text Mining MICC-IKAT Annual Reports 1993–2005
This short paper presents two experiments that address visualization of correlated authors which have been obtained through text mining IKAT annual reports. By generating a 2-dimensional topology that reflects the structure underlying the extracted features, a person investigating this information can oversee the information more efficiently. Similar techniques can be used in financial fraud in...